ABSTRACT



This paper, a discussion of the methodology of matrix sampling and
of the empirical and theoretical research on matrix sampling,
attempts to demonstrate the following points:

1. Matrix sampling can be viewed as a simple two-factor, random
model analysis of variance design, the matrix sampling formulas
for estimating the mean and variance being simply the point estimate
formulas for estimating components of the underlying linear model.

2. These formulas can be based on the weakest possible set of
assumptions, viz., random and independent sampling of examinees and
items. No assumptions about the statistical nature of the data need
be made.

3. The literature is unclear about what effect the above sampling
assumptions have upon matrix sampling in the estimation of the
mean and, especially, the variance.

4. Of the three alternative procedures suggested for dealing
with negative variance estimates in multiple matrix sampling --
equating the negative estimate to zero, Winsorizing the distribution
of estimates, or treating all estimates alike regardless of sign --
the third procedure appears to be most promising. A simulation study
is necessary to determine the shape of the distribution of variance
component estimates for matrix sampling as well as the relative
efficiency of the three methods for handling negative estimates.










INTRODUCTION 



Matrix sampling is a relatively new psychometric technique for
estimating test score parameters. The most concise and
complete discussion of this technique appears in Lord and Novick
(1968). The theory used by Lord to derive the matrix sampling estimate
formulas is, however, highly sophisticated and equally complicated.
If matrix sampling were to become a sufficiently useful technique
to warrant its inclusion in a less sophisticated, but more widely
readable textbook on measurement theory (such as Gulliksen, 1950;
Horst, 1966; Magnusson, 1966), the Lord presentation would be
undesirable from the standpoint of its complexity. The present
formulation relies on the direct application of familiar point
estimate procedures in the analysis of variance. Since such procedures
are more widely known and have greater intuitive appeal, the author
feels that they would be more amenable to the purposes of the "average"
measurement text.

The material that follows is organized into four sections. The
first two sections concentrate on describing the technique of matrix
sampling (with examples) and reviewing most of the literature on the
theoretical development and empirical validation of the technique.
Section 3 presents the derivation of formulas for the mean and
variance estimates using a relatively simple analysis of variance
design. In section 4 the assumptions underlying the estimate
formulas are discussed in relation to the use of multiple matrix
sampling. Emphasis is given to the negative variance estimate problem
and procedures suggested to handle this problem.



1. THE MATRIX SAMPLING TECHNIQUE 



Consider a large high school with, say, 250 students in the
eleventh grade. Suppose the school administration decides for one
reason or another that it is interested in knowing how proficient
(defined in terms of the mean and variance) the eleventh grade is in,
say, arithmetic fundamentals as measured by some test having, say,
30 arithmetic fundamental items.

Obviously, one approach would be to give all 250 students or
examinees (denoted the population of examinees) the arithmetic
test -- that is, each examinee would respond to all 30 problems or
items (denoted the population of items). This would amount to 7500
(250 x 30) examinee-item responses. Depending upon how many examinees
could take the test at one time and how long it would take to respond
to each of the items, this testing could amount to a fairly long time
-- more time, perhaps, than would be feasible given the schedules of
the students, personnel, and school in general.

A second approach, one traditionally used in establishing norms
for standardized tests, would be to randomly select a sample of
examinees, say 125, and give them the entire 30-item test -- this
procedure will be referred to as examinee sampling. Here, the sample
of examinees' scores would be used to estimate what the mean and
variance would have been had all 250 examinees taken the 30-item test.

A third approach, called matrix sampling, follows the procedures
of the second approach, but with one important exception: items, as
well as examinees, are randomly sampled. For example, the sample of
125 examinees might each be given a sample of 15 items. Again, the
data would be used to obtain estimates of what the mean and variance
of the arithmetic fundamental scores would have been had the population
of examinees responded to the population of items.

Now clearly, the first procedure, that of collecting complete
data on everybody, would be most desirable. We would not have to
estimate the mean score of all 250 examinees -- we could, in fact,
compute the actual mean. The assumption that is being made here,
however, is that the collection of complete data is not practical for
various reasons, e.g., lack of time, money, or personnel.

If complete data are not obtained, then it would seem desirable
that the sampling procedure employed sample as many items and examinees
as possible. Thus, it would appear that examinee sampling is
preferable to matrix sampling since the former sampled half of the
examinee-item responses in the matrix population and the latter
sampled only one quarter of these responses. However, if more than
one matrix sample is strategically extracted from the population
matrix -- a procedure called multiple matrix sampling -- matrix
sampling can be more representative of the population than any other
sampling procedure, given fairly stringent economic requirements.
Figure 1 illustrates this point. The large rectangle represents the
examinee-by-item population matrix of responses for a population of
100 (randomly arranged) examinees and 25 (randomly arranged) items.

FIGURE 1

The smaller rectangles, arranged diagonally, represent 5 random
matrix samples having 20 examinees and 5 items each; thus, each
matrix sample contains 100 examinee-item responses. The shaded
rectangle represents one possible examinee sample of 20 examinees
responding to all 25 items -- a total of 500 examinee-item responses.
Clearly, both the combined matrix samples and the single examinee
sample require the same number of examinee-item responses (one fifth
of the matrix population). However, the matrix samples are more
representative of the matrix population of examinee-item responses
than is the examinee sample. Each of the 5 multiple matrix samples
would yield estimates of the mean and variance of the arithmetic
fundamental scores. The 5 mean estimates can be averaged to produce
a final estimate of the mean; the 5 variance estimates can be averaged
to produce a final estimate of the variance.
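
The cost and coverage argument made for Figure 1 can be checked with a short sketch. The code below is only an illustration under stated assumptions: the function name `diagonal_matrix_samples` and the fixed random seed are hypothetical, and the diagonal pairing of examinee blocks with item blocks is one way to realize the arrangement described in the text.

```python
import random

def diagonal_matrix_samples(n_examinees, n_items, k, rng):
    """Return k disjoint (examinee_ids, item_ids) blocks.

    Examinees and items are shuffled, then cut into k equal blocks;
    block b of examinees is paired with block b of items, which gives
    the diagonal arrangement described for Figure 1.
    """
    examinees = list(range(n_examinees))
    items = list(range(n_items))
    rng.shuffle(examinees)
    rng.shuffle(items)
    n, m = n_examinees // k, n_items // k
    return [(examinees[b * n:(b + 1) * n], items[b * m:(b + 1) * m])
            for b in range(k)]

rng = random.Random(0)  # fixed seed (an assumption) so the sketch is repeatable
samples = diagonal_matrix_samples(100, 25, 5, rng)

# Cost of each design, counted in examinee-item responses.
matrix_cost = sum(len(e) * len(i) for e, i in samples)
examinee_cost = 20 * 25  # one examinee sample: 20 examinees, all 25 items

# How much of each margin of the population matrix the matrix samples touch.
examinees_touched = {i for e, _ in samples for i in e}
items_touched = {j for _, its in samples for j in its}

print(matrix_cost, examinee_cost)
print(len(examinees_touched), len(items_touched))
```

Both designs collect the same number of responses, but the five matrix samples together involve every examinee and every item, whereas the single examinee sample involves only 20 of the 100 examinees.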

By way of summary, the fundamental methodological advance of
matrix sampling is this: every examinee (from a finite or conceptually
infinite population of examinees) need not respond to every item (from
a finite or conceptually infinite population of items) in order to
obtain estimates of the moments of the distribution of the population
of examinees' responses to the population of items. This paper will
be concerned only with the first and second moments, i.e., the mean and
variance of the examinee score distribution. Analogous procedures
can be used to estimate these parameters of the item "score"
distribution.

From this description of the technique, it should be clear that
the more popular name "item sampling" is a misnomer. It is not only
items that are being sampled, but examinees are being sampled as well.
In other words, it is a two-dimensional, examinee-by-item array of
responses that is being sampled from the population examinee-by-item
response array. For this reason, the older term "matrix sampling"
(see review of Lord's initial papers in section 2) is used in this
paper.



2. REVIEW OF THE LITERATURE 



The research available regarding matrix sampling can generally
be divided into two classes: that concerning the empirical validation
of matrix sampling and that dealing with the theoretical development
of matrix sampling. This chapter will briefly review these two
classes of literature in that order.

Most of the empirical research on matrix sampling has consisted 
of studies attempting to verify that matrix sampling does what it is 
intended to do. These studies have followed one basic paradigm: 

1. Obtain the entire matrix population of responses, thus
obtaining the actual values of the population parameters to be
estimated.

2. Generate parameter estimates using both the multiple matrix
sampling and the more traditional examinee sampling methods.

3. Compare the matrix sampling estimates to those of the
examinee samples in terms of closeness to the actual population values.
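
The three steps of this paradigm can be sketched in code for the estimation of the mean. Everything about the data is an assumption made for illustration: the response model in `simulate_population`, the seed, and the function names are hypothetical and are not taken from any of the studies reviewed here. The matrix samples are nonoverlapping, and each examinee sample consists of the same examinees scored on all items.

```python
import random

def simulate_population(n_examinees, n_items, rng):
    """Hypothetical binary response matrix (an assumed model, not real data):
    P(correct) is the average of an examinee ability and an item easiness."""
    ability = [rng.random() for _ in range(n_examinees)]
    easiness = [rng.random() for _ in range(n_items)]
    return [[1 if rng.random() < (a + e) / 2 else 0 for e in easiness]
            for a in ability]

def block_mean(matrix, rows, cols):
    """Mean response over the sub-matrix given by rows x cols."""
    return sum(matrix[i][j] for i in rows for j in cols) / (len(rows) * len(cols))

def run_paradigm(matrix, k, rng):
    n_examinees, n_items = len(matrix), len(matrix[0])
    rows = list(range(n_examinees))
    cols = list(range(n_items))
    rng.shuffle(rows)
    rng.shuffle(cols)
    n, m = n_examinees // k, n_items // k
    row_blocks = [rows[b * n:(b + 1) * n] for b in range(k)]
    col_blocks = [cols[b * m:(b + 1) * m] for b in range(k)]

    # Step 1: the actual population value.
    pop_mean = block_mean(matrix, range(n_examinees), range(n_items))

    # Step 2: multiple matrix sampling estimate (average of k block means)
    # and k examinee sampling estimates (same examinees, all items).
    matrix_est = sum(block_mean(matrix, r, c)
                     for r, c in zip(row_blocks, col_blocks)) / k
    examinee_ests = [block_mean(matrix, r, range(n_items)) for r in row_blocks]

    # Step 3: count the examinee samples that the matrix estimate beats.
    wins = sum(abs(matrix_est - pop_mean) < abs(e - pop_mean)
               for e in examinee_ests)
    return pop_mean, matrix_est, examinee_ests, wins

rng = random.Random(1962)  # arbitrary seed
population = simulate_population(1000, 70, rng)
pop_mean, matrix_est, examinee_ests, wins = run_paradigm(population, 10, rng)
print(pop_mean, matrix_est, wins)
```

The count `wins` plays the role of the "closer than x of the 10 examinee sampling estimates" summaries reported in the studies below; its value here depends entirely on the assumed response model and seed.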

The first such study (Lord, 1962) employed a 70-item test and
1000 examinees. All 70,000 examinee-item responses were obtained, and
the mean and variance of examinee test scores were calculated. Then
10 matrix samples of 7 items and 100 examinees each were randomly
generated; the separate matrix sample estimates were averaged, yielding
final estimates of the population mean and variance. Also, the 100
examinees in each sample were scored on all 70 items, creating 10
examinee samples; for each of these samples, mean and variance
estimates were obtained in the usual manner. In comparison, the matrix
sampling estimate of the mean was closer (in absolute difference) to
the population mean than were 7 of the 10 examinee sampling estimates;
the matrix sampling estimate of the variance was closer to the
population variance than 5 of the 10 examinee sampling estimates. Lord
points out that one reason why these results are not more strikingly
in favor of matrix sampling is that the item samples were drawn with
replacement -- that is, they were not nonoverlapping as a more
efficient design would dictate.

Plumlee (1964) followed the basic paradigm with a 30 item test 
and 200 examinees . Although nonoverlapping matrix samples were used, 
the matrix sampling variance estimate was closer to the population 
value than only 1 of 10 examinee sampling estimates . Matrix sampling 
estimated the mean, however, better than all but 2 examinee samples. 

Cook and Stufflebeam (1967) extended the paradigm for
validating the estimation procedures of matrix sampling by using
variable-sized matrix samples on both the item and examinee dimensions.
Their results generally support the findings of the above studies.
Again, the variance was not as well estimated as the mean by matrix
sampling.

Husek and Sirotnik (1967) followed the above paradigm but with
two different kinds of tests: an achievement test designed to maximize
variability among subjects and an objective-meeting test designed to
minimize variability among subjects. For the achievement test data,
matrix sampling was more efficient without exception than the examinee
samples. For the objective-meeting test, matrix sampling estimated
the mean better than 4 and the variance better than 5 out of 5 examinee
samples. It was concluded that the efficiency of matrix sampling
might be dependent upon the purpose for which the test was intended.

Cahen et al. (1967) used a different design to investigate
the efficiency of matrix sampling estimates. Matrix sampling data
(not the matrix population) were collected initially for a 50-item
test on the first day of testing. On the second day, the entire
matrix population of data was obtained for a "nominally" parallel
50-item test. Comparisons were then made between the matrix sampling
estimates of the mean (variance estimates were not considered) of the
first day with the population mean obtained on the second day.
Discrepancies were discussed in relation to varying testing time
limits and examinee sample sizes.

The theoretical literature will now be considered. The concept
of matrix sampling as a psychometric technique for estimating
population score parameters from partial data apparently originates with
Frederic Lord. In a series of five publications, Lord discussed
the technique under several different names and within different but
related contexts.

In 1955, Lord referred to matrix sampling as Type 12 sampling,
a logical extension of Type 1 sampling (the sampling of examinees)
and Type 2 sampling (the sampling of test items), with primary emphasis
on resulting standard errors of measurement. Matrix sampling was the
term used by Lord (1959a) wherein the main concern was with the
estimation of various moments of the distribution of examinee true
scores and the relationship of true scores to observed scores. The
term item sampling was introduced by Lord (1959b) referring again to
the same process of matrix sampling, which was discussed as one of
several possible true score models available in mental test theory.
In 1960, Lord showed how the "item sampling" model could be used to
estimate (a) true score distributions on lengthened and shortened
forms of a given test and (b) the relationship between observed scores
on two parallel test forms given data on only one form. Finally, Lord
(1965) discussed the concept of matrix sampling explicitly in terms of
a data gathering procedure. Most recently (1968), this last work has
been revised and incorporated as a comprehensive chapter on "item
sampling" in Lord and Novick's text on mental test theory.

In order to present matrix sampling in a rigorous and generalized
framework, Hooke (1956a, 1956b) developed an algebra involving
symmetric polynomials of the elements in a matrix. The functions are
called generalized symmetric means (gsm's) and have the property of
being inherited on the average, i.e., the expected value of a gsm in
a matrix sample is equal to the same quantity in the matrix population.
Certain linear combinations of gsm's, called bipolykays, turn out to
be estimates of the moments of the matrix population, thus providing
a convenient way to obtain formulas for the estimated examinee score
mean and variance from a matrix sample. Appendix 5 contains a brief
introduction to Hooke's formulation and how it is used by Lord (1965
or Lord and Novick, 1968) to derive the following formulas for the
estimated mean and variance (see Appendix 1 for notational
definitions):

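(1)  $\hat{\mu} = \bar{X}$   for finite and infinite matrix
     populations,

(2)  $\hat{\sigma}_\lambda^2 = \frac{(N-1)\,n\,[m(M-1)s_y^2 + (M-m)s_p^2 - (M-m)\bar{y}(1-\bar{y})]}{NM(n-1)(m-1)}$
     for finite matrix populations,

(3)  $\hat{\sigma}_\lambda^2 = \frac{n\,[m s_y^2 + s_p^2 - \bar{y}(1-\bar{y})]}{(n-1)(m-1)}$
     for infinite matrix populations.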



3. ANALYSIS OF VARIANCE FORMULATION OF MATRIX SAMPLING 

The main purpose of the present paper is to derive the formulas
(1, 2, and 3) given at the end of the previous chapter using a simple
examinees-by-items, repeated measures analysis of variance design.
Although the use of analysis of variance procedures to obtain first
and second moment estimates has been suggested by Lord and Novick
(1968), the above formulas have never been explicitly derived without
the use of bipolykays. Since Hooke's formulation is relatively
complex and difficult to follow, the author feels that the
subsequent presentation provides a convenient and simple exposition
of the actual use of matrix sampling with perhaps a more intuitive
feeling for the above formulas. Also, certain empirical problems
resulting from the use of multiple matrix sampling, not explicitly
clear in Lord's presentations, appear to be more easily discussed
in terms of the present framework.

The two-dimensional array in Figure 2 (taken in conjunction with
the notation given in Appendix 1) defines the quantities which are used
below. This array represents a typical matrix sample of n examinees
and m items drawn independently and at random from corresponding
populations of N examinees and M items. This array can also be
considered as an n x m factorial design with one observation per
cell. That is, an n-level examinee factor (E) is completely
crossed (measured repeatedly) over the m-level item factor (I). The
observation in any given cell represents an examinee-item response
which can be either binary (0 or 1) or nonbinary (1, 2, etc.).
Furthermore, E and I are random factors, i.e., the levels
(examinees) of factor E as well as those (items) of factor I are
randomly selected from corresponding populations of levels. (N and
M, the corresponding population sizes, can be either finite or
infinite, depending upon the model viewed most reasonable by the
researcher.)

In order to deal with the various sources of variability in
this design, the following linear and additive model is traditionally
used (Winer, 1962):

(4)  $X_{ij} = \mu + \lambda_i + \pi_j + e_{ij}$ ,

where

$\mu$ = general level effect, equivalent to the matrix
population mean;

$\lambda_i = (\Lambda_i - \mu)$ = examinee i effect in the population
of examinees;

$\pi_j = (\Pi_j - \mu)$ = item j effect in the population
of items;

$e_{ij} = (X_{ij} - \Lambda_i - \Pi_j + \mu)$ = residual effect, assumed
to be due only to error of measurement.

In partitioning the total variability of the $X_{ij}$ into sources
corresponding to the components in (4) and deriving their expected
values (see Table 1), the following assumptions are made:

i)  $\lambda_i$, $\pi_j$, and the $e_{ij}$ are independent random variables
    with means of zero and variances of $\sigma_\lambda^2$, $\sigma_\pi^2$, and
    $\sigma_e^2$.* (Note homogeneity of error variances.)

ii) In view of the additivity of the model (no interaction effect),
    homogeneity of covariance is assumed among the population of
    error-free items. (This is equivalent to assuming that the
    item intercorrelation or covariance matrix has rank 1 -- with
    the exception of those populations of items where the
    intercorrelations consist only of both perfect positive (+1) and
    perfect negative (-1) correlations.)

Our main concern is to obtain estimates of $\mu$ and $\sigma_\lambda^2$, denoted
$\hat{\mu}$ and $\hat{\sigma}_\lambda^2$, from the matrix sample data. (The estimate of item mean
variance $\hat{\sigma}_\pi^2$ can be obtained using analogous procedures.) This
can be done as usual, selecting the appropriate E[MS] and solving
for the desired variance component using the appropriate MS estimates.

For the estimate of $\mu$, the expected value of both sides of (4)
is taken as follows:

$E[X_{ij}] = E[\mu + \lambda_i + \pi_j + e_{ij}]$
$\qquad\;\; = \mu + 0 + 0 + 0$   (by assumptions in i).

Hence,

(5)  $\hat{\mu} = \bar{X}$ ,

that is, the mean of the matrix sample is an unbiased estimate of the
mean in the matrix population, which is the same as formula (1)
above, given by Lord.

*It should be emphasized that the assumption that these effects are
normally distributed is not necessary in order to derive component
of variance estimates.



To estimate $\sigma_\lambda^2$, the SS in Table 1 are first converted into the
more familiar matrix sample statistics as follows:

(6)  $SS_E = nm\,s_y^2$ , where $s_y^2$ is the sample variance of examinee
     mean scores;

(7)  $SS_I = nm\,s_p^2$ , where $s_p^2$ is the sample variance of item mean
     scores;

(8)  $SS_{EI} = nm\,\bar{s}_j^2 - nm\,s_y^2$ , where $\bar{s}_j^2$ is the average item
     variance in the sample.

(Proofs of these equivalencies are found in Appendix 2, Proofs 1, 2,
and 3.) We can now solve for the desired component of variance
estimate $\hat{\sigma}_\lambda^2$. From Table 1 it is evident that



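(9)  $\sigma_\lambda^2 = \frac{E[MS_E] - (1 - \frac{m}{M})\,E[MS_{EI}]}{m}$ .

Using corresponding MS as estimates, we have

(10)  $\hat{\sigma}_\lambda^2 = \frac{SS_E}{m(n-1)} - \frac{(1 - \frac{m}{M})\,SS_{EI}}{m(n-1)(m-1)}$ .

Making substitutions using the above equivalencies (6 and 8) for the
SS, we have

(11)  $\hat{\sigma}_\lambda^2 = \frac{nm\,s_y^2}{m(n-1)} - \frac{(1 - \frac{m}{M})(nm\,\bar{s}_j^2 - nm\,s_y^2)}{m(n-1)(m-1)}$

      $= \frac{nm(m-1)s_y^2 - nm\,\bar{s}_j^2 + nm\,s_y^2 + \frac{nm^2}{M}\bar{s}_j^2 - \frac{nm^2}{M}s_y^2}{m(n-1)(m-1)}$

      $= \frac{nM(m-1)s_y^2 - nM\bar{s}_j^2 + nMs_y^2 + nm\,\bar{s}_j^2 - nm\,s_y^2}{M(n-1)(m-1)}$

      $= \frac{n s_y^2[M(m-1) + M - m] - n\bar{s}_j^2(M-m)}{M(n-1)(m-1)}$

      $= \frac{n[m(M-1)s_y^2 - (M-m)\bar{s}_j^2]}{M(n-1)(m-1)}$ .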



At this point it must be noted that the use of sigma in $\hat{\sigma}_\lambda^2$ (and $\hat{\sigma}_\pi^2$)
is valid only if this population variance is defined as follows:

(12)  $\sigma_\lambda^2 = \sum_{i=1}^{N} \lambda_i^2 / (N - 1)$ .

Since we are interested in the quantity more usually defined as
$\sum \lambda_i^2 / N$, the estimate given by (11) must be corrected by a
multiplicative factor of (N - 1)/N. Making this correction, and allowing
the use of the same sigma, (11) becomes

(13)  $\hat{\sigma}_\lambda^2 = \frac{(N-1)\,n\,[m(M-1)s_y^2 - (M-m)\bar{s}_j^2]}{NM(n-1)(m-1)}$ .

This formula will hold in general, whether or not the items are
binary. When the items are binary, we note the following
relationship:

(14)  $\bar{s}_j^2 = \bar{y}(1 - \bar{y}) - s_p^2$ .   (See Appendix 2, Proof 4.)

Substituting (14) into (13) we obtain

(15)  $\hat{\sigma}_\lambda^2 = \frac{(N-1)\,n\,[m(M-1)s_y^2 + (M-m)s_p^2 - (M-m)\bar{y}(1-\bar{y})]}{NM(n-1)(m-1)}$ ,

which is exactly the same as formula (2) given above by Lord for
the case in which both examinee and item populations are finite.
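
The finite-population estimate can be sketched directly from the sample statistics defined in (6) through (8). The code below is an illustration rather than a definitive implementation: the function names and the small response matrix `x` are hypothetical, and the sample variances are taken with divisors n and m, which is what the identities $SS_E = nm\,s_y^2$ and $SS_I = nm\,s_p^2$ imply.

```python
def matrix_sample_stats(x):
    """Sample statistics for an n-by-m matrix sample x (list of rows)."""
    n, m = len(x), len(x[0])
    grand = sum(map(sum, x)) / (n * m)                     # grand mean
    y = [sum(row) / m for row in x]                        # examinee mean scores
    p = [sum(row[j] for row in x) / n for j in range(m)]   # item mean scores
    s_y2 = sum((v - grand) ** 2 for v in y) / n            # divisor n
    s_p2 = sum((v - grand) ** 2 for v in p) / m            # divisor m
    # Average over items of the within-item variance (divisor n).
    s_j2_bar = sum(sum((row[j] - p[j]) ** 2 for row in x) / n
                   for j in range(m)) / m
    return n, m, grand, s_y2, s_p2, s_j2_bar

def variance_estimate_finite(x, N, M):
    """Examinee score variance estimate for populations of N examinees and
    M items: the estimate of (11) multiplied by (N - 1)/N."""
    n, m, _, s_y2, _, s_j2_bar = matrix_sample_stats(x)
    numerator = (N - 1) * n * (m * (M - 1) * s_y2 - (M - m) * s_j2_bar)
    return numerator / (N * M * (n - 1) * (m - 1))

# Hypothetical binary matrix sample: 5 examinees by 4 items.
x = [[1, 0, 1, 1],
     [0, 0, 1, 0],
     [1, 1, 1, 0],
     [1, 0, 0, 0],
     [1, 1, 1, 1]]

n, m, grand, s_y2, s_p2, s_j2_bar = matrix_sample_stats(x)
print(variance_estimate_finite(x, N=250, M=30))
# Binary-item identity: average item variance versus grand*(1-grand) - s_p2.
print(s_j2_bar, grand * (1 - grand) - s_p2)
```

One sanity check on the algebra: when the "sample" is the whole population (n = N and m = M), the second term vanishes and the estimate reduces to $s_y^2$, the variance of the examinee mean scores with divisor N.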

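For the case in which both N and M are infinite, we can
either (a) take the limit of (13) as both N and M approach
infinity or (b) note that the E[MS] of Table 1 take the following
forms and perform a derivation analogous to that from (9) to (13)
above:

$E[MS_E] = \sigma_e^2 + m\sigma_\lambda^2$

$E[MS_I] = \sigma_e^2 + n\sigma_\pi^2$

$E[MS_{EI}] = \sigma_e^2$

In general, then, for infinite populations we have

(16)  $\hat{\sigma}_\lambda^2 = \frac{n[m\,s_y^2 - \bar{s}_j^2]}{(n-1)(m-1)}$ .

If the items are dichotomous, then

(17)  $\hat{\sigma}_\lambda^2 = \frac{n[m\,s_y^2 + s_p^2 - \bar{y}(1-\bar{y})]}{(n-1)(m-1)}$ .

This is exactly the same as Lord's formula (formula (3) above) for
the infinite case.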
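
For infinite populations, the analysis of variance route and the closed form in terms of the sample statistics should give the same number. The sketch below checks this on a small matrix. The function names and data are hypothetical, the residual is computed as total minus examinee minus item sums of squares, and sample variances use divisor n, as implied by the SS equivalencies above.

```python
def mean_squares(x):
    """MS for examinees, items, and residual in an n-by-m matrix sample."""
    n, m = len(x), len(x[0])
    grand = sum(map(sum, x)) / (n * m)
    row_means = [sum(r) / m for r in x]
    col_means = [sum(r[j] for r in x) / n for j in range(m)]
    ss_e = m * sum((v - grand) ** 2 for v in row_means)
    ss_i = n * sum((v - grand) ** 2 for v in col_means)
    ss_total = sum((v - grand) ** 2 for r in x for v in r)
    ss_ei = ss_total - ss_e - ss_i          # residual (examinee-by-item)
    return ss_e / (n - 1), ss_i / (m - 1), ss_ei / ((n - 1) * (m - 1))

def variance_component_anova(x):
    """Point estimate (MS_E - MS_EI) / m for infinite populations."""
    ms_e, _, ms_ei = mean_squares(x)
    return (ms_e - ms_ei) / len(x[0])

def variance_component_closed_form(x):
    """n[m s_y^2 - (average item variance)] / ((n-1)(m-1))."""
    n, m = len(x), len(x[0])
    grand = sum(map(sum, x)) / (n * m)
    s_y2 = sum((sum(r) / m - grand) ** 2 for r in x) / n
    p = [sum(r[j] for r in x) / n for j in range(m)]
    s_j2_bar = sum(sum((r[j] - p[j]) ** 2 for r in x) / n
                   for j in range(m)) / m
    return n * (m * s_y2 - s_j2_bar) / ((n - 1) * (m - 1))

x = [[1, 0, 1, 1],
     [0, 0, 1, 0],
     [1, 1, 1, 0],
     [1, 0, 0, 0],
     [1, 1, 1, 1]]
print(variance_component_anova(x), variance_component_closed_form(x))

# A tiny sample in which the examinee means are identical: the estimate
# of a variance comes out negative.
crossed = [[1, 0],
           [0, 1]]
print(variance_component_anova(crossed))
```

The second matrix shows how a negative variance estimate can arise from a single matrix sample: the examinee mean square is zero while the residual mean square is not.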



Formulas for $\hat{\sigma}_\pi^2$ in the finite and infinite case are symmetric
to those for $\hat{\sigma}_\lambda^2$.



It can easily be argued that the foregoing model was unnecessarily
restrictive in terms of the assumptions made. In the opening
paragraph of his chapter on "item sampling" (Lord and Novick, 1968),
Lord states that

     This chapter deals with the case where
     the . . . test items are a random sample from
     the population of items. This item-sampling
     model makes no other assumptions about the
     nature of the test. (p. 234)

He later states that the examinees are a random sample from a
population of examinees and makes the further assumption that the

     . . . sample of items and the sample of
     examinees are drawn independently of each
     other. (p. 236)

In discussing the problems of estimation using this kind of model,
Lord states that (author's symbols substituted for Lord's)

     In many of the usual, simple types
     of estimation problem, a population is
     completely specified by a convenient
     univariate frequency distribution. Many
     powerful estimation methods are available
     for such problems. Similarly a matrix
     population could be specified by an
     M-variate or an N-variate frequency
     distribution, provided that an appropriate and
     convenient mathematical form could be
     found . . . .

     In the absence of an adequate
     parametric form for a frequency distribution,
     how are we to describe a matrix population
     without using a huge number of parameters?
     (pp. 237-238)

Lord goes on to answer this question with the presentation of Hooke's
formulation of matrix sampling, which rests in fact only on the
assumption of independent and random sampling of examinees and items.

It is well known that point estimates of the variance components
for the model given by (4) do not require any assumption regarding
the shape of the distributions (see footnote, p. 17). In fact, this
parametric model with several modifications can be used with the
same, weak assumptions of the Hooke approach. This can best be
seen by gradually relaxing the restrictions put on (4) for deriving
the expected mean squares.

One severe restriction in the model was the assumption of
additivity, viz., the lack of provision for an interaction effect
apart from error. Eliminating this assumption, the model given by
(4) can be slightly modified to produce the following linear and
nonadditive model:

(18)  $X_{ij} = \mu + \lambda_i + \pi_j + \lambda\pi_{ij} + e_{ij}$ ,

where

$\lambda\pi_{ij}$ = an additional independently distributed random
variable representing the interaction effect
between examinee i and item j in the
matrix population.

We now possess a model where the form of the error-free item
covariance matrix among items in the population is arbitrary, i.e.,
there is no necessity for assuming the matrix to be of rank 1.

But with the addition of an interaction effect, the unrealistic
assumption of its independence of examinee and item effects is made.
Furthermore, the restriction of homogeneity of error variance is
still present. Cornfield and Tukey (1956) have demonstrated that
expected values of mean squares can be derived using what is referred
to as a "pigeonhole" model. Specifically, for the two-way
classification the authors describe the model as follows (corresponding
symbols of the present paper are substituted for those of Cornfield
and Tukey):

     Let there be NM pigeonholes arranged
     in N rows and M columns. Let there be at
     least R elements in the population in each
     pigeonhole. Let a sample of n rows be drawn
     from the N potential rows. Let a sample of
     m columns be drawn from the M potential
     columns. The nm intersections of a selected
     row with a selected column specify the nm
     pigeonholes which become the cells of the actual
     experiment. In each of these nm cells, let a
     sample of r elements be drawn. The values of
     the nmr elements thus obtained are the numbers
     which are to be analyzed. Assume that all the
     samplings -- of rows, of columns, and within
     pigeonholes -- are at random and independent
     of one another. This is the only assumption
     we shall make. Note that it is an assumption
     about the set-up of the experiment and not
     about the behavior of those things on which
     the experiment is performed. (p. 909)

They go on to point out the generality of their model in that (a) no
constant variance is assumed for the cells and (b) no assumption is
made about interaction, i.e., the interaction effects are dependent
upon the particular rows and columns that happen to be sampled.

Now consider Table 2, which presents the most general form of
the E[MS] for this model. When r = 1 in this pigeonhole model
(see Table 3), we have the matrix sampling situation, but based only
on the above assumptions.

At this point, a subtle and important conceptual problem arises.
How are we to treat the MN "populations" of replications of size
R, given that r will always equal one? Three distinct choices
are available: (a) R = $\infty$, (b) R finite and greater than r, or
(c) R finite and R = r = 1. We must also be concerned with our
treatment of M (as finite or infinite) since it enters into the
$E[MS_E]$. From Table 4 it is evident that (a) when M is finite, R
must be finite and equal to one in order that $\hat{\sigma}_\lambda^2$ be computed exactly
and (b) when M is infinite, the exact estimate of $\sigma_\lambda^2$ can always
be computed regardless of the value of R. (In Table 4 both M and
N are treated alike to illustrate the analogy for estimating
$\sigma_\pi^2$.)

TABLE 2

General Form of the E[MS] for the Two-Factor Design

Source    E[MS]

E         $(1 - r/R)\sigma_e^2 + r(1 - m/M)\sigma_{\lambda\pi}^2 + rm\sigma_\lambda^2$
I         $(1 - r/R)\sigma_e^2 + r(1 - n/N)\sigma_{\lambda\pi}^2 + rn\sigma_\pi^2$
EI        $(1 - r/R)\sigma_e^2 + r\sigma_{\lambda\pi}^2$
Error     $(1 - r/R)\sigma_e^2$


TABLE 3

The E[MS] of Table 2 when r = 1

Source    E[MS]

E         $(1 - 1/R)\sigma_e^2 + (1 - m/M)\sigma_{\lambda\pi}^2 + m\sigma_\lambda^2$
I         $(1 - 1/R)\sigma_e^2 + (1 - n/N)\sigma_{\lambda\pi}^2 + n\sigma_\pi^2$
EI        $(1 - 1/R)\sigma_e^2 + \sigma_{\lambda\pi}^2$


TABLE 4

E[MS] of Table 3 for Indicated Values of M and R

M (and N) finite

Source            E[MS]

R infinite
  E               $\sigma_e^2 + (1 - m/M)\sigma_{\lambda\pi}^2 + m\sigma_\lambda^2$
  I               $\sigma_e^2 + (1 - n/N)\sigma_{\lambda\pi}^2 + n\sigma_\pi^2$
  EI              $\sigma_e^2 + \sigma_{\lambda\pi}^2$

R finite; R > r
  E               $(1 - 1/R)\sigma_e^2 + (1 - m/M)\sigma_{\lambda\pi}^2 + m\sigma_\lambda^2$
  I               $(1 - 1/R)\sigma_e^2 + (1 - n/N)\sigma_{\lambda\pi}^2 + n\sigma_\pi^2$
  EI              $(1 - 1/R)\sigma_e^2 + \sigma_{\lambda\pi}^2$

R finite; R = 1
  E               $(1 - m/M)\sigma_{\lambda\pi}^2 + m\sigma_\lambda^2$
  I               $(1 - n/N)\sigma_{\lambda\pi}^2 + n\sigma_\pi^2$
  EI              $\sigma_{\lambda\pi}^2$

M (and N) infinite

Source            E[MS]

R infinite
  E               $\sigma_e^2 + \sigma_{\lambda\pi}^2 + m\sigma_\lambda^2$
  I               $\sigma_e^2 + \sigma_{\lambda\pi}^2 + n\sigma_\pi^2$
  EI              $\sigma_e^2 + \sigma_{\lambda\pi}^2$

R finite; R > r
  E               $(1 - 1/R)\sigma_e^2 + \sigma_{\lambda\pi}^2 + m\sigma_\lambda^2$
  I               $(1 - 1/R)\sigma_e^2 + \sigma_{\lambda\pi}^2 + n\sigma_\pi^2$
  EI              $(1 - 1/R)\sigma_e^2 + \sigma_{\lambda\pi}^2$

R finite; R = 1
  E               $\sigma_{\lambda\pi}^2 + m\sigma_\lambda^2$
  I               $\sigma_{\lambda\pi}^2 + n\sigma_\pi^2$
  EI              $\sigma_{\lambda\pi}^2$


In any case, it is clear that the explicit use of analysis of
variance estimation procedures can be used to derive formulas
equivalent to those of Lord without making any stronger assumptions.
Although the derivations for the variance estimate in this section
were based on the strong model given by (4), exactly the same algebra
would be involved for any of the models in Table 4. However, the
usual procedure of taking the expected value of both sides of (18)
to obtain an estimate of $\mu$ cannot be done in view of the weak
assumptions made (e.g., $\lambda\pi_{ij}$ not assumed to be statistically
independent of the remaining effects). It can easily be shown,
however, that $\hat{\mu} = \bar{X}$ under the sampling assumptions made (see
Appendix 2, Proof 6).



4. DISCUSSION 




This section will attempt to coordinate the theoretical
development of matrix sampling given in the previous chapter
with the actual use of, and problems with, the technique
when applied to psychometric data. Specifically, the discussion will
center around the use of multiple matrix sampling, the possibility of
obtaining negative variance estimates being given particular attention.

Multiple matrix sampling (briefly discussed in the first section)
is the process of randomly drawing more than one matrix sample from
a matrix population, computing the desired parameter estimate from
each sample, and combining these estimates to produce one final,
more stable estimate. There are at least three ways in which sampling
from the item population can be systematically accomplished: (a)
sampling with replacement, i.e., any given item sample can be drawn
more than once, (b) sampling with "restricted replacement," i.e.,
any particular item sample cannot be drawn more than once but any
given item can appear in more than one item sample, or (c) sampling
without replacement, i.e., no item or item sample can be drawn more
than once. Since the same remarks apply to the sampling of examinees,
there is a total of nine different ways to draw matrix samples from
the matrix population.
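
The three item-sampling schemes can be made concrete with a short sketch. The scheme labels, the function name `draw_item_samples`, and the choice to represent an item sample as a sorted tuple are all assumptions made for illustration; the same function could be applied to examinee indices.

```python
import math
import random
from itertools import product

SCHEMES = ("with_replacement", "restricted_replacement", "without_replacement")

def draw_item_samples(M, m, K, scheme, rng):
    """Draw K item samples of size m from items 0..M-1 under one scheme."""
    items = list(range(M))
    if scheme == "with_replacement":
        # (a) Samples are drawn independently, so the same item sample
        # may turn up more than once.
        return [tuple(sorted(rng.sample(items, m))) for _ in range(K)]
    if scheme == "restricted_replacement":
        # (b) No item sample repeats, but samples may share items.
        if K > math.comb(M, m):
            raise ValueError("not enough distinct item samples")
        drawn = []
        while len(drawn) < K:
            candidate = tuple(sorted(rng.sample(items, m)))
            if candidate not in drawn:
                drawn.append(candidate)
        return drawn
    if scheme == "without_replacement":
        # (c) Samples are disjoint: no item appears in two samples.
        if K * m > M:
            raise ValueError("not enough items for disjoint samples")
        shuffled = rng.sample(items, K * m)
        return [tuple(sorted(shuffled[b * m:(b + 1) * m])) for b in range(K)]
    raise ValueError("unknown scheme: " + scheme)

# Crossing the three item schemes with the three examinee schemes.
DESIGNS = list(product(SCHEMES, SCHEMES))

rng = random.Random(0)  # arbitrary seed
for scheme in SCHEMES:
    print(scheme, draw_item_samples(25, 5, 5, scheme, rng))
print(len(DESIGNS))
```

Crossing the item scheme with the examinee scheme yields the nine designs counted in the text.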

Unfortunately, Lord's discussion of multiple matrix sampling
is rather vague from both theoretical and methodological standpoints.
In fact, the only discussion of multiple matrix sampling approaching
some degree of rigor deals only with estimating the population mean
(Lord and Novick, 1968). The author can find no such discussion
regarding the estimation of the population variance; yet the procedures
of multiple matrix sampling have been employed for both estimates,
starting with Lord (1962). To be more specific, in section 11.12
(Lord and Novick, 1968) Lord offers (with no formal proof) the
following statements (corresponding symbols of the present paper
have been used in place of Lord's):



The methods of the preceding section
[estimating a mean from a single matrix
sample] do not make full use of the
advantages of item sampling. Under a more
efficient procedure to be outlined in this
section, the examiner administers different
samples of binary items to different
subgroups of examinees. This procedure draws
on a mathematical formulation that has
arisen from certain unpublished suggestions
of Dr. William W. Turnbull.

Suppose K nonoverlapping random
samples of m binary items each are drawn
(without replacement) from an M-item test
and treated as separate subtests; it is not
required that K = M/m or K = N/n. A
different subtest is administered to each
of K nonoverlapping random samples of n
examinees drawn from a population of N
examinees. If $\bar{X}_k$ is the mean relative
score of subgroup k on an m-item test,
then the average $\bar{X}_k$ is an unbiased
estimator of $\mu$, the mean score of the
N examinees on the M-item test. (p. 255)



No parallel theorem is stated for variance estimates, yet the
above procedure has been used for the $\hat{\sigma}_\lambda^2$ as well as the $\bar{X}_k$ of
the matrix samples (see Chapter 2 starting with Lord, 1962). Consider,
however, the following statements in section 11.14 (Lord and Novick,
1968) regarding the estimation of $\sigma_\lambda^2$ for finite populations:



As we saw in section 11.12, it is much
better to administer many different m-item
subtests than just one. If this is done, it
is again important . . . that every item appear
an equal number of times; that all [possible]
pairs of items appear in the subtests, if
possible; and that each pair be administered
to the same number of examinees. When all
[possible] pairs cannot be used, good balanced
designs may sometimes be found with the help of
tables of balanced incomplete blocks.... (p. 259)



(Knapp (in press) gives a detailed discussion of the application
of balanced incomplete block designs to the estimation of the mean
and variance.) Clearly, if these above criteria are satisfied, then
it is impossible to have nonoverlapping samples of items.

The above conflicting points of view and practice must further
be reconciled with the following statements (Lord, 1962; see
discussion of this paper in Chapter 2):



In retrospect, the foregoing item-
sampling procedure is seen to have been
unnecessarily inefficient. Items were
sampled with replacement after each sampling
for the reason that such sampling is
effectively the same as sampling from an infinite
pool of items, and the available formulas in
Lord (1960) for utilizing the resulting data
are discussed in terms of sampling from an
infinite pool. It would have been better to
sample without replacement, thus dividing the
70 items at random into 10 overlapping 7-item
tests. Hooke's (1956a) basic derivations
show that the same formulas would be valid
for such sampling without replacement.
(pp. 261-262)



The author can find no discussion of multiple matrix sampling
in Hooke (1956a or b). Perhaps Lord was simply referring to the
fact that theory was available for sampling a matrix sample when
the examinee and/or item population is finite. As seen in section 3,
the Cornfield and Tukey (1956) approach supplies the same finite
sampling theory. In both presentations, the process of selecting
more than one matrix sample is discussed only in the context of
defining the "inherited on the average" property. For example,
consider the following definition by Hooke (1956a):



Let $x_I$ ($I = 1, 2, \ldots, N$) be any
population of N numbers, and let $x_i$ ($i = 1,
2, \ldots, n$) represent elements of a sample of
size n from this population. Let
$f(n; x_1, \ldots, x_n)$ be a polynomial which is
symmetric in the $x_i$ and has coefficients
which are functions of n. Such a function
extends obviously to a polynomial
$f(N; x_1, \ldots, x_N)$, the corresponding symmetric
polynomial in the $x_I$, with the coefficients
changed only by replacing n by N. Writing
"ave" for the operation of averaging over all
$\binom{N}{n}$ distinct samples of size n from the
population, we say that $f(n; x_1, \ldots, x_n)$ is
'inherited on the average' if
ave $f(n; x_1, \ldots, x_n) = f(N; x_1, \ldots, x_N)$.
(p. 55)

Clearly, this type of sampling is the second type referred to above
as sampling with restricted replacement. It would seem appropriate
to stick to this type of sampling if estimates from multiple samples
were to remain unbiased. In this case, nonoverlapping sampling
(the sampling of matrices without replacement) would be an invalid
restriction.
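
Hooke's definition can be checked numerically for a familiar case. The sketch below takes f to be the variance with divisor n - 1, a symmetric polynomial whose coefficients depend only on n, and averages it over all distinct samples. The population values and function names are hypothetical choices made for illustration.

```python
from fractions import Fraction
from itertools import combinations


def unbiased_variance(values):
    """f(n; x_1, ..., x_n): sum of squared deviations divided by n - 1."""
    n = len(values)
    mean = Fraction(sum(values), n)
    return sum((Fraction(v) - mean) ** 2 for v in values) / (n - 1)


def average_over_distinct_samples(f, population, n):
    """Hooke's 'ave': the mean of f over all distinct samples of size n."""
    samples = list(combinations(population, n))
    return sum(f(s) for s in samples) / len(samples)


population = [0, 1, 1, 3, 4, 7]  # hypothetical population, N = 6
ave = average_over_distinct_samples(unbiased_variance, population, 3)
# Compare ave f(n; ...) with f(N; ...), using exact rational arithmetic.
print(ave == unbiased_variance(population))
```

Exact fractions are used so that the comparison is not blurred by rounding error.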



This point can perhaps be better illustrated from the analysis
of variance viewpoint and the Cornfield and Tukey (1956) approach.
In order to derive expected values of mean squares, the only
assumptions made were that rows and columns were randomly and
independently sampled from their corresponding populations. In
multiple nonoverlapping matrix samples, only the first sample
satisfies these assumptions. The fact that these rows and columns
(examinees and items) cannot be included in subsequent matrix samples
imposes a dependence and non-randomness on the rows and columns
sampled in subsequent matrix samples.



The author does not know what effect the restriction of
nonoverlapping matrix samples has on the resulting parameter estimates.
Although the kth (k > 1) matrix sample will not strictly conform
to the above sampling assumptions, it might be argued intuitively
that the items and examinees of this sample are unbiased in the sense
that they would have had the same chance of being selected at the
outset as those of any other sample. The parameter estimates might
then be considered in the same sense as being unbiased.
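
For the mean, the intuitive argument can be checked by brute force on a very small population. The sketch below enumerates every nonoverlapping design with K = 2 matrix samples and averages the resulting estimates. The 4 by 4 data matrix and all function names are hypothetical, and the check covers the mean only; it does not address the variance estimate.

```python
from fractions import Fraction
from itertools import permutations


def ordered_disjoint_blocks(labels, size, K):
    """Yield every ordered assignment of K nonoverlapping blocks of `size` labels."""
    seen = set()
    for perm in permutations(labels, size * K):
        blocks = tuple(frozenset(perm[k * size:(k + 1) * size]) for k in range(K))
        if blocks not in seen:
            seen.add(blocks)
            yield blocks


def mean_of_estimates(X, n, m, K):
    """Average, over all nonoverlapping designs, of the mean of the K sample means."""
    N, M = len(X), len(X[0])
    examinee_designs = list(ordered_disjoint_blocks(range(N), n, K))
    item_designs = list(ordered_disjoint_blocks(range(M), m, K))
    total, count = Fraction(0), 0
    for rows in examinee_designs:
        for cols in item_designs:
            sample_means = [
                Fraction(sum(X[i][j] for i in rows[k] for j in cols[k]), n * m)
                for k in range(K)
            ]
            total += sum(sample_means) / K
            count += 1
    return total / count


# Hypothetical 4-examinee by 4-item binary population.
X = [[1, 0, 1, 1], [0, 0, 1, 0], [1, 1, 1, 0], [0, 1, 0, 0]]
mu = Fraction(sum(map(sum, X)), 16)
print(mean_of_estimates(X, n=2, m=2, K=2) == mu)
```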



The reason for heavy concentration on the theory thus far lies
in the following fact: It is possible that the variance estimate
generated from the matrix sampling formula is negative. Recognition
of the fact that variance component estimates can be negative is not
unique to the present paper. Thompson (1962) in regard to variance
component estimates makes the following statements:



The traditional estimators . . . are
obtained . . . by equating the mean-squares
to their expectations and solving. Clearly
the traditional estimate . . . may be negative;
should this occur, we do not believe that any
such statistical analysis would become useful
until it is decided what to do with the
negative estimate. This, then, is an example of
what we mean by 'the problem of negative
estimates of variance components.' Two possible
explanations of a negative estimate present
themselves: (1) the assumed model may be
incorrect and (2) statistical noise may have
obscured the underlying physical situation.

(p. 27 ^) 



Husek and Sirotnik (1968) actually obtained a negative variance
estimate while conducting a study (see section 2) involving multiple
matrix sampling from an already known matrix population of data.
By setting Lord's formula less than zero and simplifying the
resulting inequality, they showed that a negative variance estimate would
be obtained whenever

$$m s_y^2 < s_X^2 - s_p^2$$

or, alternatively, whenever

$$\alpha < 0$$

where $\alpha$ is Cronbach's (1951) generalized coefficient of internal
consistency. (If the items are dichotomous, $\alpha$ = K-R 20, the
Kuder-Richardson (1937) coefficient.)



In the present framework, this result is relatively trivial.
Hoyt (1943) showed that the repeated measures analysis of variance
design could be used as an alternative approach to obtaining a
measure of internal consistency equivalent to that of K-R 20.
(See Appendix 2, Proof 5 for a derivation of this fact for the more
general case of Cronbach's coefficient alpha.) In the most general
form, Hoyt's result can be written as follows:

(19)    $\alpha = (MS_E - MS_{IE}) / MS_E$ .

By substituting the E[MS] for the MS in this equation, it can
easily be seen that the ratio is a ratio of true score variance to
the total true score plus error variance. Although the above formula
was originally derived using the strong analysis of variance model
first presented in section 3, it is just as valid using the weakest
model, viz., the Cornfield and Tukey (1956) pigeon-hole approach.
This point was made by Cronbach, Rajaratnam, and Gleser (1963) who
attempted to free reliability theory from the concept of "parallel"
measures. They redefined reliability in terms of generalizing from
a sample of observations to a universe (or population) of observations.
They did not wish to be restrained by any statistical characteristics
of the item (or examinee) population (e.g., unit rank, no systematic
examinee-by-item interaction, equality of error variance); hence,
the Cornfield and Tukey approach provided the needed theoretical
framework.

Now, returning to (19), it is clear that whenever $MS_E < MS_{IE}$,
$\alpha < 0$. If we substitute the equivalence relations given in section 3
for the sums of squares into this inequality, we obtain

(20)    $$\frac{nm s_y^2}{n-1} < \frac{nm\bar{s}^2 - nm s_y^2}{(n-1)(m-1)}$$

        $$(m-1) s_y^2 < \bar{s}^2 - s_y^2$$

        $$m s_y^2 < s_X^2 - s_p^2 ,$$

which, since $\bar{s}^2 = s_X^2 - s_p^2$ with $s_X^2$ the variance of the $X_{ij}$,
is exactly the relationship found by Husek and Sirotnik (1968).
The relationship between $\hat{\sigma}_\lambda^2$ and $\alpha$ can be seen directly by first
noting that

(21)    $$\hat{\sigma}_\lambda^2 = \frac{MS_E - MS_{IE}}{m}$$

and then combining (19) and (21) yielding

(22)    $$\hat{\sigma}_\lambda^2 = \frac{\alpha \, MS_E}{m} .$$

The point, again, is this: Since $\sigma_\lambda^2$ (not $\hat{\sigma}_\lambda^2$) can never be
negative, when $\hat{\sigma}_\lambda^2 < 0$, then either (a) the theoretical assumptions
underlying the derivation of the E[MS] have been violated or (b)
we have been victimized by extreme sampling fluctuation. With
respect to the first explanation, the literature is not clear on
what the effect (if any) is on variance component estimates when
matrices are sampled without replacement. Lord states (Lord and
Novick, 1968) that the estimates of the population mean from such
samples are unbiased. He implies that the same holds true for
estimates of the population variance (Lord, 1962).
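
The link between a negative variance component estimate and a negative alpha can be illustrated on a small data matrix. The sketch below computes the examinee and interaction mean squares, alpha as their difference divided by the examinee mean square, and the variance component estimate as the same difference divided by m. The two data matrices and the function name are hypothetical; variances use divisor n (or m), following the "by definition of a variance" convention of Appendix 2.

```python
def anova_summary(X):
    """Mean squares, alpha, and the examinee variance component for an n x m matrix."""
    n, m = len(X), len(X[0])
    grand = sum(map(sum, X)) / (n * m)
    y = [sum(row) / m for row in X]                             # examinee means
    p = [sum(X[i][j] for i in range(n)) / n for j in range(m)]  # item means
    ss_e = m * sum((yi - grand) ** 2 for yi in y)
    ss_ie = sum((X[i][j] - y[i] - p[j] + grand) ** 2
                for i in range(n) for j in range(m))
    ms_e = ss_e / (n - 1)
    ms_ie = ss_ie / ((n - 1) * (m - 1))
    s2_y = sum((yi - grand) ** 2 for yi in y) / n
    s2_bar = sum(sum((X[i][j] - p[j]) ** 2 for i in range(n)) / n
                 for j in range(m)) / m                         # mean item variance
    return {
        "ms_e": ms_e,
        "ms_ie": ms_ie,
        "alpha": (ms_e - ms_ie) / ms_e,    # alpha from the mean squares
        "var_hat": (ms_e - ms_ie) / m,     # variance component estimate
        "m_s2_y": m * s2_y,
        "s2_bar": s2_bar,
    }


# Hypothetical samples: examinee means nearly equal, then clearly spread out.
flat = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0]]
spread = [[1, 1, 1], [1, 1, 0], [0, 0, 0], [1, 0, 0]]
for name, X in (("flat", flat), ("spread", spread)):
    r = anova_summary(X)
    print(name, r["alpha"] < 0, r["var_hat"] < 0, r["m_s2_y"] < r["s2_bar"])
```

In each case the three comparisons agree with one another, as the inequalities in the text require.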

Suppose the second explanation is accepted. That is, suppose
a researcher is using multiple matrix sampling to establish norms
on some test and he obtains a negative variance estimate from one
or more matrix samples. The researcher is in the position of having
to deal with these negative values in some kind of averaging process
to arrive at a final estimate. Consider the following three possible
approaches, presented in decreasing order in terms of mathematical
justification and increasing order (in the opinion of the author)
in terms of reasonability.

It is common practice to regard negative variance component
estimates as, for all practical purposes, zero. Mathematical
justification for this practice stems from the fact that the
maximum likelihood estimate of $\sigma_\lambda^2$ is zero when $MS_{IE} > MS_E$ under
the restriction of positive component estimates (Thompson, 1962).
But consider the following remarks by Scheffé (1959):



It may happen with positive probability
that the estimate of a variance component is
negative ... Since the estimated parameter
is nonnegative, the estimate is sometimes
modified by redefining it to be zero when it
is negative ... We prefer not to use such
modified estimates: their distribution
theory is more complicated . . . and the
modified estimates are biased. (p. 229)



Sometimes when the researcher is willing to break away from his
search for the mathematically rigorous solution, he can stumble upon
more non-rigorous data analytic methods which might correspond more
closely to the behavior of the real world. (Such correspondence
would, of course, have to be validated empirically.) As Tukey (1962)
puts it in his discussion of the future of data analysis:



We should seek out unfamiliar summaries
of observational material, and establish their
useful properties.... Many seem to find it
essential to begin with a probability model
containing a parameter, and then to ask for a
good estimate for this parameter (too often,
unfortunately, for one that is optimum).
Many have forgotten that data analysis can,
sometimes quite appropriately, precede
probability models, that progress can come from
asking what a specified indicator (= a
specified function of the data) may
reasonably be regarded as estimating. Escape from
this constraint can do much to promote
novelty. (p. 5)



Lindquist (1956) states that variance component estimates are
approximately normally distributed when the degrees of freedom
involved are very large. Clearly, in matrix sampling n and nm
are apt to be fairly small. The author knows of no research
regarding the small sampling distribution of variance component
estimates. The negative variance estimates in matrix sampling data,
however, suggest that at least one end of this distribution is
rather long-tailed. Making the assumption that this distribution is
approximately symmetrical, the other end can be regarded as
long-tailed, produced by extremely high component estimates. In other
words, in a distribution of variance component estimates obtained
from small samples, any negative estimates (and an equal number
of largest positive estimates) might be regarded as outliers and
not representative of the population parameter.

A simple and intuitive procedure for handling these outliers is
to "trim" or "Winsorize" (Tukey, 1962) the distribution of estimates
before averaging to produce the final estimate. A trimmed distribution
is one where an equal number of lowest and highest outliers are
eliminated from the distribution. In a Winsorized distribution,
these outliers are forced equal to the remaining lowest and highest
observations respectively.

Specifically, suppose k ordered variance estimates
$\hat{\sigma}^2_{(1)} \le \hat{\sigma}^2_{(2)} \le \cdots \le \hat{\sigma}^2_{(k)}$ have
been obtained by multiple matrix sampling and t of these are
negative. Then the mean of the Winsorized distribution of these
estimates would be given as follows:

$$\frac{1}{k}\left[(t+1)\,\hat{\sigma}^2_{(t+1)} + \sum_{i=t+2}^{k-t-1} \hat{\sigma}^2_{(i)} + (t+1)\,\hat{\sigma}^2_{(k-t)}\right]$$



Just what the shape of the sampling distribution of variance
component estimates is must be settled empirically, starting with
computer simulated data and corresponding distributions of mean
variance component estimates using both maximum likelihood and
Winsorizing approaches. Depending upon the shape of obtained
distributions of variance component estimates for this type of small
sampling, various non-symmetrical Winsorizing approaches (Dixon,
1960) might also be compared.

The third approach to be suggested will be prefaced by the
following discussion regarding confidence intervals for variance
component estimates by Scheffé (1959):



Some discussion is required because
one or both end points of the interval may
be negative while the true value of [the
parameter] is of course nonnegative...

It would be mathematically correct to
modify the interval so that if the left end
point is negative it is replaced by zero
and if the right end point is negative it
is also replaced by zero...

Although there is nothing in the formal
theory of confidence intervals to justify it,
most users of confidence intervals have a more
or less conscious feeling that the length of
a two-sided confidence interval is a measure
of the error of some point estimate of the
parameter . . .

In light of the above discussion we see
that if the interval is considerably
shortened by deleting the part, if any, to the left
of the origin, a misleading impression of the
accuracy of the estimation may result. If the
interval is completely to the left of the
origin one might consider translating it until
it just includes the origin, to meet the above
objection to shortening it. However, one
might again feel on nonmathematical and
intuitive grounds that an interval estimate
like that from -5 to -3 is stronger
evidence that the true value of a nonnegative
parameter is zero than that from -2 to 0.
(pp. 229-231)



Extrapolating from Scheffé's argument to the situation in multiple
matrix sampling, one might intuitively feel that the magnitude of
variance component estimates carries with it important information,
regardless of whether it is positive or negative. Thus, it would
seem reasonable to average these estimates without any modifications
whatsoever.

Table 5 presents the ten matrix sample estimates obtained in the
Husek and Sirotnik (1968) study (see section 2). Averages obtained
under the three possible ways of handling the negative estimate are
also presented. It can be seen that equating the negative estimate
to zero or symmetrical Winsorizing produce nearly the same result.
Averaging in the negative estimate, however, yields a final estimate
which is substantially closer to the actual population value. Again,
a simulation study is needed to evaluate the relative merits of the
three proposed procedures.










TABLE 5

A Comparison of Three Alternative
Procedures for Handling Negative Variance
Component Estimates*

Data taken from Husek and Sirotnik (1968)

Matrix      Obtained      Equating Negative      Symmetrical
Sample      Estimates     Estimate To Zero       Winsorizing

   1          .00645           .00645               .00583
   2          .00583           .00583               .00583
   3          .00479           .00479               .00479
   4          .00412           .00412               .00412
   5          .00384           .00384               .00384
   6          .00276           .00276               .00276
   7          .00191           .00191               .00191
   8          .00179           .00179               .00179
   9          .00074           .00074               .00074
  10         -.00181           0                    .00074

Average       .00304           .00322               .00323

(Population variance = .00258)

* Variance estimates are of examinee mean scores.
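
The three averages in the table can be recomputed from the ten obtained estimates. The sketch below is one way to do so; the function names are hypothetical, and the Winsorizing depth is taken to equal the number of negative estimates, as in the description of symmetrical Winsorizing given in the text.

```python
def plain_average(estimates):
    """Third procedure: average every estimate, whatever its sign."""
    return sum(estimates) / len(estimates)


def zeroed_average(estimates):
    """First procedure: set negative estimates to zero, then average."""
    return sum(max(e, 0.0) for e in estimates) / len(estimates)


def winsorized_average(estimates):
    """Second procedure: symmetrical Winsorizing with t = number of negatives."""
    ordered = sorted(estimates)
    k = len(ordered)
    t = sum(1 for e in ordered if e < 0)
    if 2 * t >= k:
        raise ValueError("too many negative estimates to Winsorize symmetrically")
    low, high = ordered[t], ordered[k - t - 1]
    # The t lowest values become `low`; the t highest become `high`.
    clipped = [min(max(e, low), high) for e in ordered]
    return sum(clipped) / k


# The ten obtained estimates listed in the table.
estimates = [0.00645, 0.00583, 0.00479, 0.00412, 0.00384,
             0.00276, 0.00191, 0.00179, 0.00074, -0.00181]
for procedure in (plain_average, zeroed_average, winsorized_average):
    print(procedure.__name__, procedure(estimates))
```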
APPENDIX 1

Notation

$\alpha$                          Cronbach's generalized coefficient of internal
                                  consistency among items.

$\epsilon_{ij}$                   Error effect in the examinee-by-item analysis of
                                  variance design (see p. 15).

E[ ]                              Expected value operator.

k                                 Subscript denoting the kth matrix sample in multiple
                                  matrix sampling.

K                                 Number of multiple matrix samples.

K-R 20                            Kuder-Richardson coefficient of internal consistency
                                  among dichotomous items.

$\lambda_i$                       Examinee effect in the examinee-by-item analysis of
                                  variance design (see p. 15).

$(\lambda\pi)_{ij}$               Interaction effect in the examinee-by-item analysis of
                                  variance design (see p. 23).

$\mu$, $\hat{\mu}$                Mean and estimated mean of the matrix population.

m, M                              Sample and population sizes of the items.

n, N                              Sample and population sizes of the examinees.

$\pi_j$                           Item effect in the examinee-by-item analysis of
                                  variance design (see p. 15).

$p_j$, $\bar{p}$                  Sample item j mean ($\bar{X}_{.j}$) and mean of the $p_j$.

r, R                              Sample and population sizes of the cells in a 2-factor
                                  analysis of variance design.

$\sigma_\lambda^2$, $\hat{\sigma}_\lambda^2$    Variance and estimated variance of the $\lambda_i$.

$\sigma_\pi^2$, $\hat{\sigma}_\pi^2$            Variance and estimated variance of the $\pi_j$.

$\sigma_{\lambda\pi}^2$           Variance of the $(\lambda\pi)_{ij}$.

$\sigma_\epsilon^2$               Variance of the $\epsilon_{ij}$.

$s_y^2$                           Variance of the $y_i$.

$s_p^2$                           Variance of the $p_j$.

$s_j^2$, $\bar{s}^2$              Variance of item j (computed over examinees) and
                                  average of the $s_j^2$.

$X_{ij}$, $\bar{X}$               The response of examinee i to item j and the mean
                                  of the $X_{ij}$.

$y_i$, $\bar{y}$                  Sample examinee mean ($\bar{X}_{i.}$) and the mean of the $y_i$.

This appendix is to be used in conjunction with Figure 2.
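APPENDIX 2

Proofs

Proof 1

$$\begin{aligned}
SS_E &= m \sum_i (y_i - \bar{X})^2 && \text{as defined in Table 1} \\
     &= mn \sum_i (y_i - \bar{X})^2 / n \\
     &= mn s_y^2 && \text{by definition of a variance}
\end{aligned}$$

Proof 2

$$\begin{aligned}
SS_I &= n \sum_j (p_j - \bar{X})^2 && \text{as defined in Table 1} \\
     &= nm \sum_j (p_j - \bar{X})^2 / m \\
     &= nm s_p^2 && \text{by definition of a variance}
\end{aligned}$$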



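Proof 3

$$\begin{aligned}
SS_{IE} &= \sum_i \sum_j (X_{ij} - y_i - p_j + \bar{X})^2 && \text{as defined in Table 1} \\
        &= \sum_i \sum_j \left[(X_{ij} - p_j) - (y_i - \bar{X})\right]^2 \\
        &= \sum_i \sum_j (X_{ij} - p_j)^2 + \sum_i \sum_j (y_i - \bar{X})^2
           - 2 \sum_i \sum_j (X_{ij} - p_j)(y_i - \bar{X})
\end{aligned}$$

But

$$\sum_i \sum_j (X_{ij} - p_j)^2 = \sum_j n \sum_i (X_{ij} - p_j)^2 / n = \sum_j n s_j^2 = nm\bar{s}^2 ,$$

$$\sum_i \sum_j (y_i - \bar{X})^2 = \sum_j n \sum_i (y_i - \bar{X})^2 / n = \sum_j n s_y^2 = nm s_y^2 ,$$

and

$$\sum_i \sum_j (X_{ij} - p_j)(y_i - \bar{X}) = \sum_i (y_i - \bar{X}) \sum_j (X_{ij} - p_j) = m \sum_i (y_i - \bar{X})^2 = nm s_y^2 .$$

Substituting,

$$SS_{IE} = nm\bar{s}^2 + nm s_y^2 - 2nm s_y^2 = nm\bar{s}^2 - nm s_y^2 .$$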
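Proof 4

For dichotomous items,

$$\bar{s}^2 = \sum_j s_j^2 / m = \sum_j p_j (1 - p_j) / m = \bar{X} - \sum_j p_j^2 / m .$$

But

$$s_p^2 = \sum_j p_j^2 / m - \bar{X}^2$$

or, rearranging terms,

$$\sum_j p_j^2 / m = s_p^2 + \bar{X}^2 .$$

Substituting,

$$\bar{s}^2 = \bar{X} - \bar{X}^2 - s_p^2 = \bar{X}(1 - \bar{X}) - s_p^2 \quad \text{or} \quad \bar{y}(1 - \bar{y}) - s_p^2 .$$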



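Proof 5

$$\begin{aligned}
\frac{MS_E - MS_{IE}}{MS_E}
  &= \frac{\dfrac{nm s_y^2}{n-1} - \dfrac{nm\bar{s}^2 - nm s_y^2}{(n-1)(m-1)}}{\dfrac{nm s_y^2}{n-1}} \\
  &= \frac{(m-1) s_y^2 - \bar{s}^2 + s_y^2}{(m-1) s_y^2} \\
  &= \frac{m s_y^2 - \bar{s}^2}{(m-1) s_y^2} \\
  &= \frac{m}{m-1} \cdot \frac{m^2 s_y^2 - m\bar{s}^2}{m^2 s_y^2} \\
  &= \frac{m}{m-1} \cdot \frac{s_{my}^2 - \sum_j s_j^2}{s_{my}^2}
     && \text{(where } s_{my}^2 = \text{variance of examinee total scores)} \\
  &= \frac{m}{m-1} \left[ 1 - \frac{\sum_j s_j^2}{s_{my}^2} \right] \\
  &= \text{Cronbach's coefficient alpha}
\end{aligned}$$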
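
The identity in Proof 5 can be spot-checked numerically. The sketch below computes alpha once from the mean squares and once from the item variances and the variance of examinee total scores. The data matrix and function names are hypothetical, and all variances use divisor n, matching the convention of the proofs.

```python
def alpha_from_mean_squares(X):
    """Alpha as (examinee MS - interaction MS) / examinee MS."""
    n, m = len(X), len(X[0])
    grand = sum(map(sum, X)) / (n * m)
    y = [sum(row) / m for row in X]
    p = [sum(X[i][j] for i in range(n)) / n for j in range(m)]
    ms_e = m * sum((yi - grand) ** 2 for yi in y) / (n - 1)
    ms_ie = sum((X[i][j] - y[i] - p[j] + grand) ** 2
                for i in range(n) for j in range(m)) / ((n - 1) * (m - 1))
    return (ms_e - ms_ie) / ms_e


def alpha_from_variances(X):
    """Alpha as m/(m-1) * (1 - sum of item variances / total-score variance)."""
    n, m = len(X), len(X[0])
    totals = [sum(row) for row in X]
    mean_total = sum(totals) / n
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    item_vars = []
    for j in range(m):
        col = [X[i][j] for i in range(n)]
        mean_j = sum(col) / n
        item_vars.append(sum((x - mean_j) ** 2 for x in col) / n)
    return m / (m - 1) * (1 - sum(item_vars) / var_total)


# Hypothetical 5-examinee by 4-item matrix of graded (non-binary) responses.
X = [[3, 2, 4, 1], [1, 1, 2, 0], [4, 4, 3, 2], [2, 0, 1, 1], [0, 1, 0, 3]]
print(abs(alpha_from_mean_squares(X) - alpha_from_variances(X)) < 1e-12)
```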



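Proof 6

To show that $\hat{\mu} = \bar{X}$, it must be shown that the mean ($\bar{\bar{X}}$) of
the sampling distribution of $\bar{X}$ (generated under the sampling
assumptions of the model) is equal to $\mu$.

The sampling distribution is made up of the means of
$\binom{N}{n}\binom{M}{m}$ possible n x m matrix samples from the N x M matrix
population. Each $X_{ij}$ will appear in $\binom{N-1}{n-1}\binom{M-1}{m-1}$ matrix samples.

For any given matrix sample,

$$\bar{X} = \frac{1}{nm} \left(\text{sum of } X_{ij} \text{ in that sample}\right) .$$

Over all possible matrix samples, then,

$$\begin{aligned}
\bar{\bar{X}} &= \frac{1}{\binom{N}{n}\binom{M}{m}} \cdot \frac{1}{nm}
                 \left(\text{sum of } X_{ij} \text{ in all samples}\right) \\
              &= \frac{\binom{N-1}{n-1}\binom{M-1}{m-1}}{\binom{N}{n}\binom{M}{m}} \cdot \frac{1}{nm}
                 \sum_{i=1}^{N} \sum_{j=1}^{M} X_{ij} \\
              &= \frac{1}{NM} \sum_{i=1}^{N} \sum_{j=1}^{M} X_{ij} \\
              &= \mu
\end{aligned}$$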
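
The counting argument of Proof 6 can be verified by enumeration on a small matrix population. The sketch below lists every n x m matrix sample, counts how often one cell appears, and compares the mean of the sample means with the population mean. The 3 by 4 data matrix and the function name are hypothetical.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def enumerate_matrix_samples(N, M, n, m):
    """Yield every (rows, cols) pair defining an n x m matrix sample."""
    for rows in combinations(range(N), n):
        for cols in combinations(range(M), m):
            yield rows, cols


X = [[1, 0, 1, 1], [0, 0, 1, 0], [1, 1, 1, 0]]  # hypothetical N = 3, M = 4
N, M, n, m = 3, 4, 2, 2
samples = list(enumerate_matrix_samples(N, M, n, m))

# How often the cell (0, 0) is included, versus the count used in the proof.
appearances = sum(1 for rows, cols in samples if 0 in rows and 0 in cols)

sample_means = [Fraction(sum(X[i][j] for i in rows for j in cols), n * m)
                for rows, cols in samples]
mean_of_means = sum(sample_means) / len(samples)
mu = Fraction(sum(map(sum, X)), N * M)
print(appearances == comb(N - 1, n - 1) * comb(M - 1, m - 1), mean_of_means == mu)
```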
APPENDIX 3

A Brief Introduction to Lord's
Use of Hooke's Approach to Derive
Formulas for the Mean and Variance Estimates
in Matrix Sampling

Let $\{i_1, i_2, i_3, \ldots, i_g, \ldots, i_a\}$ be a set of a alternative indices
for examinees.

Let $\{j_1, j_2, j_3, \ldots, j_h, \ldots, j_b\}$ be a set of b alternative
indices for items.

Let $\{p_1, p_2, p_3, \ldots, p_{ab}\}$ be a set of ab integral
powers.

Let $X_{ij}$ be the response of examinee i to item j in the
n-examinee by m-item matrix randomly sampled from a population
N-examinee by M-item matrix. (For the following discussion, both N
and M will be taken to be infinite.)

Definition: Denoting a generalized symmetric mean as gsm,

$$\text{gsm} = \frac{1}{T} \sum_{\ne} \prod_{g=1}^{a} \prod_{h=1}^{b} X_{i_g j_h}^{\,p_{b(g-1)+h}}$$

where

$$T = n(n-1) \cdots (n-a+1) \, m(m-1) \cdots (m-b+1)$$

and $\ne$ denotes the distinctness of the $i_g$ and the $j_h$.

Note: The gsm is symmetric, i.e., its value is invariant under
permutations of rows and/or columns of the matrix.

Definition: A bipolykay is a linear combination of gsm's.

Theorem: A gsm is inherited on the average, i.e.,

E[gsm in matrix sample] = E[gsm in matrix population].

Corollary: A bipolykay is also inherited on the average.

For convenience, we can specify any given gsm by an "operator
matrix" whose rows specify the a alternative examinee indices,
columns specify the b alternative item indices, and elements
specify the ab integral powers for the corresponding $X_{ij}$. Thus,
the gsm as defined above can be specified as follows:

$$\begin{bmatrix}
p_1 & p_2 & \cdots & p_j & \cdots & p_b \\
p_{b+1} & p_{b+2} & \cdots & p_{b+j} & \cdots & p_{2b} \\
\vdots & \vdots & & \vdots & & \vdots \\
p_{b(i-1)+1} & p_{b(i-1)+2} & \cdots & p_{b(i-1)+j} & \cdots & p_{ib} \\
\vdots & \vdots & & \vdots & & \vdots \\
p_{b(a-1)+1} & p_{b(a-1)+2} & \cdots & p_{b(a-1)+j} & \cdots & p_{ab}
\end{bmatrix}$$

There is only one 1st power gsm, namely (zero rows and columns
are used to achieve uniformity in notation for 1st and 2nd degree
gsm's),

$$\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}
  = \frac{1}{nm} \sum_{i}^{n} \sum_{j}^{m} X_{ij} = \bar{X}$$

There are four possible 2nd degree gsm's, for example

$$\begin{bmatrix} 2 & 0 \\ 0 & 0 \end{bmatrix}
  = \frac{1}{nm} \sum_{i}^{n} \sum_{j}^{m} X_{ij}^2$$

$$\begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix}
  = \frac{1}{nm(m-1)} \sum_{i}^{n} \sum_{j_1 \ne j_2}^{m} X_{i j_1} X_{i j_2}
  = \frac{1}{m-1} \left( m s_y^2 + m\bar{y}^2 - \bar{y} \right)$$

$$\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
  = \frac{1}{m(m-1)n(n-1)} \sum_{i_1 \ne i_2}^{n} \sum_{j_1 \ne j_2}^{m} X_{i_1 j_1} X_{i_2 j_2}
  = \frac{1}{(n-1)(m-1)} \left[ (nm - n - m)\bar{y}^2 - n s_p^2 - m s_y^2 + \bar{y} \right]$$
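
The closed forms for the two cross-product gsm's can be checked by brute force on a small binary matrix. The sketch below assumes dichotomous (0/1) responses, for which the square of a response equals the response itself; it also checks that the difference of the two gsm's equals n/((n-1)(m-1)) times [m s_y^2 - ybar(1 - ybar) + s_p^2]. The data matrix and function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def gsm_same_examinee(X):
    """gsm [1 1; 0 0]: mean of X[i][j1] * X[i][j2] over distinct item pairs."""
    n, m = len(X), len(X[0])
    total = sum(X[i][j1] * X[i][j2]
                for i in range(n) for j1, j2 in permutations(range(m), 2))
    return Fraction(total, n * m * (m - 1))


def gsm_different_examinees(X):
    """gsm [1 0; 0 1]: mean of X[i1][j1] * X[i2][j2] with i1 != i2 and j1 != j2."""
    n, m = len(X), len(X[0])
    total = sum(X[i1][j1] * X[i2][j2]
                for i1, i2 in permutations(range(n), 2)
                for j1, j2 in permutations(range(m), 2))
    return Fraction(total, n * (n - 1) * m * (m - 1))


def closed_forms(X):
    """Closed-form values for binary data, with divisor-n (or m) variances."""
    n, m = len(X), len(X[0])
    y = [Fraction(sum(row), m) for row in X]
    p = [Fraction(sum(X[i][j] for i in range(n)), n) for j in range(m)]
    y_bar = sum(y) / n
    s2_y = sum((v - y_bar) ** 2 for v in y) / n
    s2_p = sum((v - y_bar) ** 2 for v in p) / m
    same = (m * s2_y + m * y_bar ** 2 - y_bar) / (m - 1)
    different = ((n * m - n - m) * y_bar ** 2 - n * s2_p - m * s2_y + y_bar) \
        / ((n - 1) * (m - 1))
    difference = Fraction(n, (n - 1) * (m - 1)) \
        * (m * s2_y - y_bar * (1 - y_bar) + s2_p)
    return same, different, difference


X = [[1, 0, 1, 1], [0, 0, 1, 0], [1, 1, 1, 0]]  # hypothetical binary sample
same, different, difference = closed_forms(X)
print(gsm_same_examinee(X) == same,
      gsm_different_examinees(X) == different,
      gsm_same_examinee(X) - gsm_different_examinees(X) == difference)
```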



Applying the expected value theorem for gsm's, it can be shown 



that 








A 

! 0 

If 



and 



Therefore (by the above corollary), the bipolykay equal to the 
difference of the above two gsm’s has the following expected value: 




Hence,

$$\hat{\sigma}_\lambda^2 = \frac{n}{(n-1)(m-1)} \left[ m s_y^2 - \bar{y}(1 - \bar{y}) + s_p^2 \right] .$$

Referring back to the 1st degree gsm, it is clear that